Voice Modification for Applications in Speech Synthesis
Abstract
A significant part of the work required to create a high-quality speech synthesizer is the creation of “synthetic voices”. Reusing an existing voice database and making it sound like a different speaker, or like the same speaker in a different emotional state or speaking style, is therefore important for creating voice options for a synthesizer more efficiently. This article reviews techniques for changing signal characteristics such as pitch and duration, as well as spectral modifications. We conclude by assessing the prospects of voice modification in speech synthesis in light of now-available advanced machine learning techniques.
Similar resources
VLSI implementation of a TSM/FSM algorithm
The time scale modification (TSM) of speech is concerned with compressing or expanding audio signals in the time domain without affecting the signal's pitch or naturalness. Conversely, the frequency scale modification (FSM) of speech is concerned with altering the pitch and formants of a signal without changing the signal duration. This paper describes a hardware implemented and optimized...
Codec integrated voice conversion for embedded speech synthesis
Voice conversion technologies transform individual characteristics of speech patterns while preserving the original content, and can be widely used in speech processing. Given the limited system resources of embedded concatenative speech synthesis in particular, voice conversion may reduce the memory consumption of the acoustic database. Voice conversion enables the intra-gender or cross-ge...
Speech Analysis – Synthesis Based on the Ptdft for Voice Conversion
The voice conversion problem has become very popular worldwide. It has applications in many fields, for example in systems that make use of prerecorded speech, such as voice mailboxes or text-to-speech synthesizers based on acoustic unit concatenation. In such cases, voice modification would be a simple and efficient way to create a desired variety of voices while avoiding recording of different spe...
Practical high-quality speech and voice synthesis using fixed frame rate ABS/OLA sinusoidal modeling
This paper describes algorithms developed to apply the Analysis-by-Synthesis/Overlap-Add (ABS/OLA) sinusoidal modeling system to real-time speech and singing voice synthesis. As originally proposed, the ABS/OLA system is limited to unidirectional time-scaling, and relies on variable frame length to accomplish time-scale modification. For speech and voice synthesis applications, unidirectional ti...
Experiments in voice quality modification of natural speech signals: the spectral approach
Voice quality is currently a key issue in speech synthesis research. The lack of realistic intra-speaker voice quality variation is an important source of concern for concatenation-based synthesis methods. A challenging problem is to reproduce the voice quality changes that occur in natural speech when the vocal effort varies. A new method for voice quality modification is presented. I...
Publication date: 2006